Automatic Semantics Using Google
نویسندگان
چکیده
We have found a method to automatically extract the meaning of words and phrases from the world-wide-web using Google page counts. The approach is novel in its unrestricted problem domain, simplicity of implementation, and manifestly ontological underpinnings. The world-wide-web is the largest database on earth, and the latent semantic context information entered by millions of independent users averages out to provide automatic meaning of useful quality. We demonstrate positive correlations, evidencing an underlying semantic structure, in both numerical symbol notations and number-name words in a variety of natural languages and contexts. Next, we demonstrate the ability to distinguish between colors and numbers, and to distinguish between 17th century Dutch painters; the ability to understand electrical terms, religious terms, emergency incidents, and we conduct a massive experiment in understanding WordNet categories; the ability to do a simple automatic English-Spanish translation.
منابع مشابه
Automatic Meaning Discovery Using Google
We present a new theory of relative semantics between objects, based on information distance and Kolmogorov complexity. This theory is then applied to construct a method to automatically extract the meaning of words and phrases from the world-wide-web using Google page counts. The approach is novel in its unrestricted problem domain, simplicity of implementation, and manifestly ontological unde...
متن کاملAutomatic Hashtag Recommendation in Social Networking and Microblogging Platforms Using a Knowledge-Intensive Content-based Approach
In social networking/microblogging environments, #tag is often used for categorizing messages and marking their key points. Also, since some social networks such as twitter apply restrictions on the number of characters in messages, #tags can serve as a useful tool for helping users express their messages. In this paper, a new knowledge-intensive content-based #tag recommendation system is intr...
متن کاملRESTDoc: Describe, Discover and Compose RESTful Semantic Web Services using Annotated Documentations
Development of a semantic web is gaining a lot of traction recently. At the same time, another change is also getting a lot popular on the web a move from complex SOAP based web services to the simpler RESTful services that work over the existing HTTP infrastructure. Various techniques had been proposed to add semantics to RESTful services. But most of these solutions suffer from the fact that ...
متن کاملFrom Strings to Things SAR-Graphs: A New Type of Resource for Connecting Knowledge and Language
Recent research and development have created the necessary ingredients for a major push in web-scale language understanding: large repositories of structured knowledge (DBpedia, the Google knowledge graph, Freebase, YAGO) progress in language processing (parsing, information extraction, computational semantics), linguistic knowledge resources (Treebanks, WordNet, BabelNet, UWN) and new powerful...
متن کاملExploiting Syntactic and Shallow Semantic Kernels for Question Answer Classification
We study the impact of syntactic and shallow semantic information in automatic classification of questions and answers and answer re-ranking. We define (a) new tree structures based on shallow semantics encoded in Predicate Argument Structures (PASs) and (b) new kernel functions to exploit the representational power of such structures with Support Vector Machines. Our experiments suggest that s...
متن کامل